When educators, trainers, and human resource development (HRD) practitioners are 
asked to produce evidence that supports the productivity of the training enterprise, they are 
often frustrated by the lack of simple and effective methods for assessment. Often, they resort 
to simple questionnaires to obtain feedback on the results of their efforts. Sometimes they just 
assume that if the training was based on a needs analysis or if it focused on what the company 
wanted, it was probably effective. However, these methods cannot easily tie training activities 
to the dollar values that are considered important by most organizations. This puts trainers at a 
disadvantage when dealing with their more financially literate colleagues. Administrators in 
the organization are likely to know exactly how much training costs, but they may have little 
idea of its real value. The training enterprise must be able to supply that information. 

The costs of training are usually measured in dollars or translated to dollars, a powerful 
measuring scale that has enormous emotional appeal to managers. Next to a dollar measure of 
cost, questionnaires or assumptions based on a needs analysis often seem like weak 
arguments. What is needed are methods that can show the value of training in terms that 
managers can understand. 

A number of methods have been used to assess the monetary benefits of training. Some 
looked at the consequences of not training employees, while others involved analyzing 
performance records or costing the training curve under training conditions. A well-known 
method is cost/benefit analysis. During a cost/benefit analysis, training costs are first 
calculated, then some performance values that occurred as a result of training are assessed, and 
an index of benefits is computed. 

Swanson and Gradous (1988) derived performance values from productivity measures 
such as the number of items produced per shift. If more items are produced after training, the 
gain is calculated by subtracting pre-training production figures from post-training production 
figures and is then converted to dollars to show performance value. The benefit is then found by 
subtracting the cost of the training from the post-training performance value. Unfortunately, 
there does not seem to be a universally accepted method for performing the analysis. 
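
The arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Swanson and Gradous's own procedure: the function name, the fixed dollar value per item, and all of the figures are hypothetical, and "performance value" is taken to mean the dollar value of the production gain.

```python
def training_benefit(pre_units, post_units, value_per_unit, training_cost):
    """Return (performance_value, benefit) in dollars.

    Assumption: output is counted in the same units over the same period
    before and after training, and each unit has a fixed dollar value.
    """
    gain_units = post_units - pre_units               # production gain
    performance_value = gain_units * value_per_unit   # gain converted to dollars
    benefit = performance_value - training_cost       # value net of training cost
    return performance_value, benefit


# Hypothetical figures: items produced over a review period.
value, benefit = training_benefit(
    pre_units=4_000, post_units=4_600, value_per_unit=25.0, training_cost=9_000.0
)
print(f"Performance value: ${value:,.2f}")
print(f"Benefit: ${benefit:,.2f}")
```

A negative benefit would indicate that the measured gain did not cover the cost of the training over the period examined.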

Whichever methods are used to calculate the benefit of training, they are probably tacit 
admissions that performance improvement efforts are very important to business, and that the 
improvements they show, or do not show, should be considered seriously. 

The training enterprise is indeed big business in America. The American Society for 
Training and Development (1996) reported 1995 data which indicated that employers' total 
expenditures on training were $55.3 billion. By 2000, the total dollars budgeted for formal 
training by U.S. businesses had dropped slightly, down to $54 billion (Training Magazine 
Staff, 2000). However, within the larger context of corporate training, some areas clearly 
stood out as growth centers. The corporate e-learning market was estimated to be $1.1 billion 
in 2000; that figure was expected to reach $11.4 billion by 2003 (Moe & Blodgett, 2000). 

Hartley (2001) noted that corporate training was expanding rapidly throughout industry. Moe 
and Blodgett estimated that the global e-learning market was valued at $300 billion and 
projected its expansion to $365 billion by 2003. The 2003 ASTD State of the Industry report 
indicated that in 2002 (the most recent year in which data was available), substantial increases 
in training expenditures occurred (see Table 1). It seemed clear that training and development 
efforts represented significant investments, amounting to 2.2% of payroll in 2002 (Sugrue, 
2003), up from 1.9% in 2001 (Thompson, Koon, Woodwell, & Beauvais, 2002). 


Table 1 

Training and Development Increases from 2001 to 2002 

Category of expenditure                   2001      2002
--------------------------------------------------------
Expenditure as a % of payroll             1.9%      2.2%
Expenditure per employee                  $734      $826
Training hours per employee               24        28
E-learning technologies training
(as a percentage of all training)         10.5%     15.4%

Source: Sugrue (2003), p. 2. 


In light of these large expenditures, managers of both private and public organizations are 
beginning to more seriously question their ability to evaluate the impact of training. In fact, 
companies want to know what benefits they can reasonably expect from training (Dionne, 
1996). Other authors voiced different perspectives for the increased interest in return on 
investment. Parsons (1995), for example, stated that it has become fashionable to analyze the 
financial costs and benefits of human resource development programs, whereas Holton (1995) 
found that the increasing global competition has led to intense pressure to demonstrate that 
programs are directly contributing to the bottom line of an organization. Phillips (1995) added 
that the pressure to measure the return on investment (ROI) is increasing. 

With so much invested, or expected to be invested, in corporate training, Blickstein 
(1996) raised the following questions: (a) What does management want from its training 
investment?, and (b) Can management's expectations be met in other than traditional bottom- 
line terms? 

To answer these questions, a brief review of the four-level evaluation system developed 
by Donald Kirkpatrick is useful. In 1959, while teaching at the University of Wisconsin, 
Kirkpatrick proposed a model that classified training outcomes into four levels. The model 
was developed to provide a framework that explains evaluation. Kirkpatrick noted that some 
training professionals believe that evaluation means measuring changes in behavior, and 
others believe that the only real evaluation lies in determining final results. Still others think 
in terms of the comment sheets that participants complete at the end of a program. "They are 
all right - and yet wrong, in that they fail to recognize that all four approaches are parts of 
what we mean by evaluating" (Brown & Seidner, 1998, p. 95). These often-cited levels 
describe a rubric that can be used to evaluate programs. 

Kirkpatrick's Model 


Level 1 Evaluation (Reaction) 

Reaction may be described as how well the trainees like a particular program 
(Kirkpatrick, 1996). This level can be called a measure of customer satisfaction. Evaluations at 
this level measure how those who participate in the program react to it. Brown and Seidner 
(1998) asserted that effective training results in favorable trainee reactions and motivates them 
to learn. Abernathy (1999) summed up Level 1 evaluation by saying that it asks, "Did you like 
the training?" (p. 20). Many business organizations conduct effective evaluation at this level. 
Education tends to use it in postsecondary settings much more than at the primary or 
secondary levels. 

Level 2 Evaluation (Learning) 

The second level of the model can be described as the extent to which participants change 
attitudes, improve knowledge, and/or increase skills as a result of attending the particular 
program (Brown & Seidner, 1998). Pine and Tingley (1993) defined Level 2 in terms of 
measuring the content of the training. To Abernathy (1999), this level answers the question, 
"Did you understand the information and score well on the test?" (p. 20). 

Regardless of how this level of evaluation is defined, Brown and Seidner (1998) 
suggested that the following questions should be asked of the trainees: (a) What knowledge 
was learned?, (b) What skills were developed or improved?, and (c) What attitudes were 
changed? If these objectives are accomplished, a change in behavior can be expected. Since it 
is more difficult and time consuming to measure learning, as compared to measuring trainee 
reactions, they proposed the following guidelines to help measure learning. 

1. Use a control group if practical. 

2. Evaluate knowledge, skills, and/or attitudes both before and after the program. 

3. Use a paper and pencil test to measure skills. 

4. Get 100% response. 

5. Use the results of the evaluation to take appropriate action. 

Educators have refined the second level of evaluation to a fine art, and they use it most 
effectively in all educational settings. Industry uses it less frequently, usually at a less 
sophisticated level. 

Level 3 Evaluation (Behavior) 

Brown and Seidner (1998) defined behavior as the extent to which the change in behavior 
has occurred as a result of the participants' attending the training session. According to 
Kirkpatrick, several conditions are necessary for a behavior change to occur. The person must 
have a desire to change, work in the right climate, and be rewarded for changing. This level 
answers the question, "Did the training help you do your job better and increase 
performance?" (Abernathy, 1999, p. 20). 

According to Blickstein (1996), the evaluations of Level 1 and Level 2 can be determined 
while the training program is still in session. However, Level 3 (Behavior) evaluation is much 
more difficult and time consuming. Brown and Seidner (1998) provided the following 
guidelines for evaluating behavior. 

1. Use a control group if practical. 

2. Allow time for behavior change to take place. 

3. Evaluate both before and after the program if practical. 

4. Survey and/or interview one or more of the following: trainees, their immediate 
supervisors, their subordinates, and others who observe their behavior. 

5. Get 100% response on sampling. 

6. Repeat the evaluation at appropriate times. 

7. Consider cost versus benefits. 

Business and industry focus on this type of evaluation much more frequently than does 
education, with the exception of technical, vocational, and occupational programs, which must 
show performance gains from training. 

Level 4 Evaluation (Results) 

This level indicates the end result from participants attending the program (Brown & 
Seidner, 1998). Parry (1996) believed that this level usually shows up as return on investment 
and thus the dollar value of the benefits of training over and above the cost of the training 
itself. Results-level evaluation answers the question, "Did the company or department increase 
profits, customer satisfaction, and so forth as a result of the training?" (Abernathy, 1999, p. 20). 

According to Kirkpatrick's model: 

Final results can include increased production, improved quality, decreased costs, 
reduced frequency and/or severity of accidents, increased sales, reduced turnover, 
and higher profits and return on investment. It has long been thought important to 
recognize that the most desirable approaches to delivering instruction (training) 
are those that are the most effective in terms of results and the most efficient in 
terms of cost. (Parry, 1976, p. 74) 

Parry appeared to agree with Blank (1982), who had stated as early as 1982 that "we need 
to strike a balance between effectiveness (does it work?) and efficiency (how much does it 
cost?)" (p. 192). Kirkpatrick (1994) then proposed the following guidelines to help evaluate at 
the results level of the model (these are similar, but not identical to, his proposals for Levels 2 
and 3). 

1. Use a control group if practical. 

2. Allow time for results to be achieved. 

3. Measure both before and after the program if practical. 

4. Repeat the measurement at appropriate times. 

5. Consider cost versus benefits. Be satisfied with evidence if proof is not possible. 

At this level, business and industry are clearly the most sophisticated users of evaluation. 
Educators have traditionally used surveys, interviews, or anecdotal evidence to assess long-term 
satisfaction with programs; and they do not often try to express their program outputs in 
terms of monetary value. However, technology is available (e.g., Unemployment Insurance 
[UI] Wage studies) to enable educators to show the long-term economic value of their 
programs (more about this later). 

Even though it has come under criticism (Holton, 1996), the four-level evaluation model 
has been acknowledged as the standard in the training field because of its simplicity and its 
ability to help people think about evaluative criteria (Alliger & Janak, 1989). Kirkpatrick's 
model has provided a vocabulary and a rough taxonomy that has clearly met an organizational 
need, and it has become well known in HRD departments around the country. It has been a 
highly successful framework that has been used in the training evaluation field for nearly 50 
years. Recent research has focused on the charge that the four-level model often stops short of 
reaching meaningful long-term results (Bernthal, 1995). In a follow-up, Phillips (1997a, 
1997b) suggested a number of modifications to the model, including the addition of a fifth 
level that specifically addresses ROI. 

It is probably safe to say that both education and industry have much to contribute to the 
evaluation process. There are lessons to be learned from both sectors across the evaluation 
spectrum. Both are good at conducting Level 1 (Reaction) evaluation. Educational practice can 
teach us how to use Level 2 (Learning) evaluation with greater precision. Both education and 
industry conduct Level 3 (Behavior) evaluation; and even though only 9% to 11% actually 
conduct evaluations at Level 4 (Results), industry does the best job. 

Return on Investment (ROI) 

Return on investment has been a critical issue for trainers and top executives in recent 
years and is a topic frequently listed on meeting agendas. This technique probably should 
receive more emphasis from educators than it has in the past. It has been continuously 
discussed in the literature (Phillips, 1996a), though not without controversy. For example, 
some professionals argue that it is not possible to calculate the ROI in education and training 
contexts, while others develop measures and ROI calculations anyway (Phillips, 1997b). Most 
authors appear to think that ROI calculations and related efforts belong in Level 4 evaluation, 
but Phillips (1997a) described examples of ROI calculations at each of the four levels. So, 
what is the current status of ROI in the field of performance improvement? 

Current Status of ROI 

Phillips (1996a) suggested that it is difficult to pinpoint the state of ROI in the profession. 
Some HR managers are reluctant to discuss internal practices, and it is difficult to find case 
studies that specifically list the strategies used by training departments in determining ROI. 
However, a study of more than 40 organizations found that the best companies are measuring 
customer requirements, testing participants, measuring what the client can and will pay for, 
and moving away from justification (Dixon, 1996). Dixon went on to state that one of the 
most striking findings was that none of the best-practice organizations was evaluating 
primarily to justify training or to maintain the training budget. They were selectively 
conducting evaluations at Levels 3 and 4, but not on a regular basis. Geber (1995) noted, with 
Dixon, that it would be impractical for most companies (or educational institutions) to do 
Level 3 and Level 4 evaluations on every single course. 

Recognizing this problem, the American Society for Training and Development (ASTD) 
collected information on this subject from more than 2,000 HRD professionals. The results 
came in the form of two publications that purported to describe how return-on-investment 
measurements were conducted in real-life situations (Phillips, 1994, 1997c). The case studies 
in this project represented a variety of settings, strategies, and approaches in manufacturing, 
service, and government organizations. Respondents varied from employees to managers 
and specialists in the training field. ROI in the companies studied ranged from 150% to 
2,000%. However, educators were not part of this study, though they probably should have 
been. 

JITE Volume 41, Number 3 - Contemporary Approaches for Assessing Outcomes... by Paul Brauchle and Klaus Sch... 
http://scholar.llb.vt.edu/ejournals/JITE/v41n3/brauchle.html 

As previously mentioned, it is possible to consider ROI a fifth level to Kirkpatrick's four- 
level model. In addition to the reaction level, learning level, behavior level, and the results 
level, a return-on-investment level would be added that compares the training's monetary 
benefits with the costs. A model had been developed that tracked the steps in measuring ROI 
from collecting post-program data to calculating the actual return (Phillips, 1996a). The model 
can compare training costs to monetary benefits, assuming that all training programs have 
reportable results as well as intangible benefits; and most do have some sort of identifiable 
outcomes. 

Basic Process for Calculating ROI 

Phillips (1996c) provided the following basic process for calculating ROI. 

1. Collect Level 4 evaluation data. 

2. Ask, "Did on-the-job application produce measurable results?" 

3. Isolate the effects of training from other factors that may have contributed to the results. 

4. Convert the results to monetary benefits. 

5. Total the costs of training. 

6. Compare the monetary benefits with the costs. 
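
Steps 3 through 6 of this process can be sketched in Python. This is a rough illustration and not Phillips's own procedure: it assumes the Level 4 result has already been expressed in dollars, and it stands in for the isolation step with a single hypothetical fraction (`training_share`) of the improvement attributed to training. The function name and all figures are invented for the example.

```python
def roi_from_results(result_value, training_share, cost_items):
    """Sketch of steps 3-6 of the process listed above.

    result_value   -- dollar value of the measured Level 4 improvement
                      (assumed to be already expressed in dollars)
    training_share -- hypothetical fraction (0 to 1) of the improvement
                      attributed to training once other factors are isolated
    cost_items     -- individual training cost figures in dollars
    """
    if not 0.0 <= training_share <= 1.0:
        raise ValueError("training_share must be between 0 and 1")
    benefits = result_value * training_share      # steps 3 and 4
    total_cost = sum(cost_items)                  # step 5
    if total_cost <= 0:
        raise ValueError("total training cost must be positive")
    return {                                      # step 6
        "benefits": benefits,
        "costs": total_cost,
        "bcr": benefits / total_cost,
        "roi_percent": (benefits - total_cost) / total_cost * 100,
    }


# Hypothetical figures, for illustration only.
summary = roi_from_results(120_000, 0.5, [20_000, 5_000, 5_000])
print(summary)
```

In practice the isolation step is rarely a single number. The control-group and estimation techniques discussed in the literature would feed into whatever value is used for the attributed share.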

When discussing the general concept of ROI, it is important to note the need for 
identifying the desired program outcomes and using the design of the evaluation to inform 
program planning. This is equally true in training and in education. Putting the necessary time 
and resources in return-on-investment analysis makes sense only if one is convinced that the 
training or education program was applied correctly in the right place. 

ROI Evaluation Recommendations in the Literature 

Several recommendations emerged as a result of these and other case studies. The first is 
that "targets should be set for each evaluation" (Phillips, 1996a, p. 44). Some organizations set 
targets for each level of their training programs. For example, some organizations require 
100% of their training programs to be evaluated at Level 1, 40% to 70% of their training 
programs at Level 2, and so forth. When an organization sets targets for accountability, it is 
sending a powerful message to the HRD department about the need for measurement and 
evaluation (Phillips). The need to set targets was part of a common belief that training design 
should occur at the same time as a measurement discussion (Williams, 1996). Simultaneously 
working on training design and measurement planning is more effective because the 
information required for training design is exactly the same information needed for solid 
measurement. 

A third recommendation was that evaluation should be focused on the micro level. 
Respondents focused on a single program or a few tightly integrated programs in order to have 
an effective ROI measurement. However, the results of the study indicated that ROI 
measurement was more effective when applied to a single program that can be linked to a 
direct payoff (Phillips, 1996a). 

A fourth recommendation dealt with the methods of collecting information. Phillips 
(1996) advocated that a variety of methods should be used to collect data. These methods 
could include interviews, focus groups, and questionnaires, but also action plans, contracts, 
and performance monitoring. Respondents used more than one or two practices to collect data 
because they recognized that programs, settings, and situations were different. In order to 
increase objectivity, Bernthal (1995) had suggested using several different measurement 
methods in the same evaluation or conducting several evaluations using different methods or 
approaches. The key message here is that one size does not fit all. 

A fifth recommendation focused on activities that are necessary precursors to ROI 
evaluation. Before attempting to conduct return-on-investment studies, it is necessary to 
isolate the effects of training from other factors that can affect business results (Phillips, 1996, 
1997b). Most of the time, improvements in job performance are only partially due to training 
programs. Variables other than training, such as trainees' age and work experience, seasonal 
sales patterns, economic changes, shifts in managerial styles, equipment breakdowns, 
customer attitudes, etc., may influence the data and make it difficult to determine the actual 
effect of the training upon the ROI results (Shelton & Alliger, 1993). They further indicated 
that a way to measure the effects of extraneous factors is to compare the results of a control 
group with the results of the trainee group. Trend-line analysis, forecasting, participant 
estimation, supervisor estimation, management estimation, customer input, expert estimation, 
subordinate input, and other factors are additional ways to isolate training's effect on 
performance (Phillips, 1996b). 
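
The control-group comparison can be illustrated with a short Python sketch. It assumes that the trained and control groups were exposed to the same outside influences, so that the remaining difference in results can be attributed to training. The function name and the data are hypothetical, and no test of statistical significance is attempted.

```python
from statistics import mean


def estimated_training_effect(trained_scores, control_scores):
    """Difference between the mean results of the trained and control groups.

    Assumes both groups faced the same outside influences (season, economy,
    equipment), so the remaining difference can be attributed to training.
    """
    return mean(trained_scores) - mean(control_scores)


# Hypothetical weekly sales per employee after the training period.
trained = [52, 48, 55, 50, 45]
control = [44, 46, 41, 47, 42]
effect = estimated_training_effect(trained, control)
print(f"Estimated effect of training: {effect} units per employee per week")
```

Only the estimated effect, not the trained group's full result, would then be converted to a monetary value for an ROI calculation.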

A final recommendation advised organizations that had little or no experience in 
calculating ROI to measure only one course at a time (Phillips, 1996a, 1996c). It is not 
practical for most organizations to calculate the ROI on all training programs because the 
result would have a large number of calculations and would be very time consuming. For this 
reason, most organizations take a practical approach to the problem and focus on one or two of 
the most important or popular training programs. If presented well, ROI calculations on just a 
few programs can be powerful. The respondents participating in Phillips's studies were not 
content to show that training can result in such improvements as increased productivity and 
decreased turnover. An additional step was taken to convert these improvements to monetary 
values that could be compared to costs for ROI calculation. For hard-data items, such as 
productivity, quality, and time, the ROI calculation was much easier than those calculations 
for soft-data items, such as customer satisfaction, employee turnover, and job satisfaction 
(Phillips, 1996c). 

The following sections will identify and discuss some approaches that individuals and 
organizations have used to assess individual performance improvement and program 
outcomes. 


Calculating the Return on Investment 

There are two common formulas used to calculate the return on investment (Phillips, 
1996b). The first is benefit/cost ratio (BCR), and the second is ROI. The benefit/cost ratio 
(BCR) can be calculated using the following formula: BCR = program benefits/program costs. 
It uses the total benefits and the total costs to obtain an index of the worthwhile results of the 
training program to its overall cost. ROI is different because it gives a percentage of return on 
the dollars that are invested in training and development. This figure is interpreted in the same 
way the returns of an investment in stocks or mutual funds would be viewed. To get ROI, the 
training costs are subtracted from the total benefits to get the net benefits, and then the net 
benefits are divided by the costs. The formula for this is ROI (%) = (net program 
benefits/program costs) x 100. Phillips (1996a) gives the following example: Suppose a 
training program produces benefits of $321,600 with a cost of $38,233. The BCR is 8.4. For 
every $1 invested, $8.40 in benefits was returned. The net benefits were $321,600 - $38,233 = 
$283,367. ROI is ($283,367/$38,233) x 100 = 741%. Using the ROI formula, for every $1 invested in 
the program, there was a return of $7.41 in net benefits. 
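
Both formulas can be checked with a few lines of Python. The sketch below uses the figures from the example above. The function names are illustrative and do not come from Phillips.

```python
def benefit_cost_ratio(benefits, costs):
    """BCR = program benefits / program costs."""
    return benefits / costs


def roi_percent(benefits, costs):
    """ROI (%) = (net program benefits / program costs) x 100."""
    net_benefits = benefits - costs
    return net_benefits / costs * 100


benefits, costs = 321_600, 38_233   # figures from the example in the text
bcr = benefit_cost_ratio(benefits, costs)
roi = roi_percent(benefits, costs)
print(f"BCR: {bcr:.1f}")    # prints "BCR: 8.4"
print(f"ROI: {roi:.0f}%")   # prints "ROI: 741%"
```

The two measures carry the same information: ROI (%) equals (BCR - 1) x 100, so a BCR of 1.0 corresponds to an ROI of 0%, the break-even point.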

Another useful method is called payback period (Phillips, 1997a). This technique usually 
makes the assumption that the cash proceeds generated by a training intervention are constant 
over time, and it calculates the time period needed to pay back the original investment. 

The formula is: Payback period = total investment/net annual savings. In the example above, 
the total investment is $38,233, and the annual savings are taken to be the program benefits of $321,600. If there is no time period 
specified, it can be safely assumed that the benefits are for a period of one year, because 
budgeting is usually done on an annual basis. Using these figures with the formula produces 
an answer of 0.1189 years, or about 43 days. In this instance, the original training investment was 
paid back within 43 days. These methods are widely used by industry. They have potential for 
education if the economic value of educational programs can be established. 
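
The payback calculation can be reproduced as follows. This is a minimal sketch that assumes, as the text does, that the benefits accrue evenly over one year. It uses a 365-day year to convert to days, and the function name is illustrative.

```python
def payback_period_years(total_investment, annual_savings):
    """Payback period = total investment / annual savings, in years."""
    if annual_savings <= 0:
        raise ValueError("annual_savings must be positive")
    return total_investment / annual_savings


years = payback_period_years(38_233, 321_600)   # figures from the text
days = years * 365                              # assumes a 365-day year
print(f"Payback period: {years:.4f} years, about {days:.0f} days")
```

Because the method assumes constant savings over time, the result is best read as a rough indicator, not as a precise date.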

The Time Value of Money 

A thorough consideration of a training or performance improvement initiative should also 
include the time-value of the money that is saved (or made). This concept has been used for 
years by production/operations managers (Riggs, 1987), but it is not often found in the 
training evaluation literature. Suppose that ROI calculations indicate that after three years, a 
training intervention will make the company $100,000. Someone now wants to know the 
present value of those future dollars, knowing intuitively that money to be received in three 
years is worth less than money received now. The general formula for the present value of 
future money is \( P = F/(1+i)^{n} \) (Riggs, 1987, p. 132). Here, P is present dollars; F is future 
dollars; 1 is a constant; i represents the opportunity cost (the rate of interest that could have 
accrued on that money if it were received now instead of three years from now); and n is the number of years 
being considered. Using the same $100,000 value, and assuming 10% per year on the money, 
the result is \( P = 100{,}000/(1+0.10)^{3} \), or $75,131.48. It can be seen that the $100,000 in three years is 
worth only about 75% of what it would be worth if received today. 
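
The present-value calculation is easy to verify in Python. The sketch below assumes annual compounding at a constant rate. The function and parameter names are illustrative.

```python
def present_value(future_dollars, rate, years):
    """P = F / (1 + i)^n, with annual compounding at a constant rate."""
    return future_dollars / (1 + rate) ** years


pv = present_value(100_000, 0.10, 3)   # figures from the text
print(f"Present value: ${pv:,.2f}")    # prints "Present value: $75,131.48"
```

Raising the assumed rate or the number of years lowers the present value, which is why distant training benefits count for less in this kind of analysis.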

On the other hand, suppose the same $100,000 must be invested now for a program that will break 
even in three years, and the future value of those present dollars is needed. Riggs (1987) 
provided the formula for calculating the future value of present dollars as \( F = P(1+i)^{n} \). So 
\( F = 100{,}000(1+0.10)^{3} \) works out to $133,100. If those dollars had been invested at 10% per year, in three 
years they would have been worth $133,100. Therefore, the value of the training program after 
three years must exceed $133,100 in order to show any gain. This concept is sometimes called 
a "hurdle rate" (Hendrick & Moore, 1985, p. 52). A training intervention, or educational or 
HRD program, must equal or exceed that figure in value; or it cannot be asserted that it has 
had any real benefit. 
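
The hurdle comparison can be sketched in Python as well. It assumes annual compounding at a constant rate. The function names and the sample program value of $150,000 are hypothetical.

```python
def future_value(present_dollars, rate, years):
    """F = P(1 + i)^n, with annual compounding at a constant rate."""
    return present_dollars * (1 + rate) ** years


def clears_hurdle(program_value, invested, rate, years):
    """True if the program's value at least matches the invested money's growth."""
    return program_value >= future_value(invested, rate, years)


hurdle = future_value(100_000, 0.10, 3)   # figures from the text
print(f"Hurdle: ${hurdle:,.2f}")          # prints "Hurdle: $133,100.00"
print(clears_hurdle(150_000, 100_000, 0.10, 3))   # hypothetical program value
```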


Utility Analysis 

This method measures the value of job performance after training. One version is based 
upon a method used by Godkewitsch (1987) that was later published by ASTD in INFO-LINE 
publication #007 (1990), entitled "How to Conduct a Cost-Benefit Analysis". The formula for 
training benefits is B = N[(E x V) - C], and the result is a dollar amount that describes what the 
training is worth to the organization in terms of the performance of the workers who have 
participated in the training (Brauchle, 1995). In this formula, N is the number of people 
trained; E is the effect of the training in standard deviation units; V is the dollar value of a 
one-standard-deviation effect; and C is the per-person cost of the program. Suppose a training intervention has raised the average 
performance of workers by 0.42 standard deviations, and the dollar value of a one-standard-deviation improvement was 
$9,600. There were 25 workers trained at a per-person cost of $2,502. Using this information 
with the formula, B = 25 [(0.42 x $9,600) - $2,502], which works out to a 
benefit of $38,250 for the training program. 
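
The utility formula can be verified with a short Python sketch. The parameter names are illustrative. The code treats V as the dollar value of one standard deviation of improvement and C as a per-person cost, which is how the worked example uses them.

```python
def utility_benefit(n_trained, effect_sd, value_per_sd, cost_per_person):
    """B = N[(E x V) - C].

    n_trained       -- N, number of people trained
    effect_sd       -- E, training effect in standard deviation units
    value_per_sd    -- V, dollar value of one standard deviation of improvement
    cost_per_person -- C, cost of the program per trainee
    """
    return n_trained * (effect_sd * value_per_sd - cost_per_person)


benefit = utility_benefit(25, 0.42, 9_600, 2_502)   # figures from the text
print(f"Benefit: ${benefit:,.0f}")                  # prints "Benefit: $38,250"
```

Because the result scales directly with V, a change in the benchmark value of one standard deviation changes the computed benefit proportionally.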

The advantages of using this method are as follows. 

1. It uses dollars as the unit of measure for the analysis. 

2. This unit of measurement is usually acceptable to managers because they understand it. 

3. The method is relatively simple to use. 

4. It does not require extensive computer capability or number crunching power. 

5. It provides results fairly quickly, without waiting for a one-year budget report. 

On the other hand, the disadvantages of using this method are as follows. 

1. This is a norm-referenced method, with several attendant problems. 

2. It relies on supervisor evaluations, which may be of doubtful validity and even more 
questionable reliability. 

3. The value of one standard deviation improvement (40 percent of annual salary, based on 
other studies) is questionable, and should not be used if there are local benchmarks 
against which to compare it. 

4. The dollars calculated as benefits are not real dollars unless there are validated 
benchmarks. 

5. It may be difficult to tie the benefits to long-term business results if the measures of 
productivity gain are taken shortly after training. 

6. The results do not answer the question, "Did you train for the right skills?" (Brauchle, 
1995) 

In summary, this method is thought to be " . . . probably worth using if you can establish 
your own benchmarks for the value of one standard deviation of productivity gain. (It) is less 
complicated than many others and may provide useful justifications for the training 
enterprise" (Brauchle, 1995, p. 17-22). 

360-Degree Appraisal Feedback 

A considerable amount of money is spent annually on supervisor and manager training, 
and some related literature has focused on the concept of 360-degree feedback as a method of 
evaluating the degree to which executive performance has improved. Antonioni (1996) 
defined 360-degree feedback as a process whereby "...individuals appraise themselves and 
also receive appraisal feedback from their appraisers: immediate supervisor, peers, and from 
direct contributors if they are managers" (p. 72). This methodology, according to Charney and 
Conway (1998), "...is a performance evaluation system that uses input from all employee 
levels to assess performance. Training is an important factor in helping employees use the 
process effectively" (p. 196). In recent years this process has been popular enough to support 
the generation of considerable software in support of it (Bracken, Summers, & Fleenor, 1998; 
Fried, 1998; Meade, 1999; Ellis, 2001). It is believed to have at least five desirable outcomes: 

1. Increased awareness of the appraisers' expectations; 

2. Improvements in work behaviors and performance; 

3. Reduction of undiscussables, specifically the appraisers' feelings and perceptions about 
undesired behaviors of those being appraised; 

4. Increase in effective periodic informal 360-degree performance reviews; and 

5. Increase in organizational learning (Antonioni, 1996). 

An effective 360-degree feedback process offers a major opportunity for organizational 
members to improve the quality of their work relationships. It can help appraisers and those 
being appraised effectively define the quality requirements in their work relationships. The 
model teaches individuals how to give and receive constructive feedback, and provides a 
structure for discussing the undiscussables (Antonioni, 1996). It is thought to not only improve 
work behaviors, but also to increase worker performance, which in turn should improve return 
on investment for the organization. However, Antonioni did not explain exactly how this 
process could actually be translated into monetary outcomes. Church and Bracken (1997) 
attempted to relate the results of 360-degree feedback to organizational performance or results. 
Although 360-degree feedback is generally accepted as a good evaluation system that has the 
potential to contribute to improved performance, it has not yet been conclusively shown how 
this method can yield monetary returns on education or training programs. However, it may 
have some useful application in education. For example, suppose teachers are evaluated 
by administrators, peers, and students. While it does not readily provide monetary data, this 
approach might put in perspective, or balance out, negative evaluations by any of these 
sources. 


Performance-Learning-Satisfaction Evaluation System (PLS) 

Another model thought to be useful in evaluating how well training has improved 
executive or supervisor performance, the PLS model embraces the domains of performance, 
learning, and satisfaction. It is thought to be both rigorous and flexible: rigorous in terms of 
core questions, techniques, and reporting results, and flexible enough for most applications. It 
also includes an auxiliary data processing program (Swanson & McClernon, 1996). When 
using this system, performance is considered in terms of two factors: (a) business results at the 
organizational, process, or individual levels; and/or (b) financial results or benefits in terms of 
money or monetary ratios. Swanson and McClernon asserted that most business results can be 
"monetized" (p. 719) and expressed as financial data. The general financial results goal of the 
model is to have benefits exceed the costs by a 2:1 ratio. Once the 2:1 return on investment 
goal is achieved, it can be said that the program achieved 100% of its goal. In that sense it 
appears to have more application value than 360-degree feedback. 

In the PLS System, learning has two components: (a) knowledge demonstrated on tests 
and other measurements, and/or (b) expertise demonstrated in simulated or actual workplace 
environments. Satisfaction is seen in terms of perceptions of the behaviors that are 
demonstrated. Trainees are expected to meet learning and expertise standards. Swanson and 
McClernon (1996) further state that if the standard is a rating of two on a one-to-three scale, 
an average score of two by the trainees would represent 100% attainment of the goal. 
Attention to raising the goal should occur only after there is clear evidence that existing 
performance and learning goals are being reached. Although the PLS Evaluation System 
depicted in the 1996 manuscript resulted in an 8:1 financial return on investment in less than a 
year, this method does not seem to have been used much in recent years. 
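
The goal-attainment arithmetic described for the PLS System can be illustrated with a short sketch. It assumes that attainment scales linearly with the ratio of the achieved figure to the goal; the examples given for the model imply this, but it is not stated here as a formula. The function name, the dollar figures, and the ratings are all hypothetical.

```python
def goal_attainment(actual: float, goal: float) -> float:
    """Return attainment as a percentage of the goal (linear scaling is assumed)."""
    if goal <= 0:
        raise ValueError("goal must be positive")
    return 100.0 * actual / goal


# Financial goal: benefits should exceed costs by a 2:1 ratio.
benefits, costs = 50_000.0, 25_000.0        # hypothetical dollar figures
financial_pct = goal_attainment(benefits / costs, goal=2.0)

# Learning goal: a standard of two on a one-to-three rating scale.
trainee_ratings = [2, 3, 1, 2, 2]           # hypothetical trainee ratings
mean_rating = sum(trainee_ratings) / len(trainee_ratings)
learning_pct = goal_attainment(mean_rating, goal=2.0)

print(f"Financial goal attainment: {financial_pct:.0f}%")
print(f"Learning goal attainment:  {learning_pct:.0f}%")
```

Under the same linear assumption, a benefit-to-cost ratio above 2:1 would register as more than 100% of the financial goal, which is consistent with the advice to raise a goal only after it is clearly being reached.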

The Balanced Scorecard 

The balanced scorecard, developed by Kaplan and Norton, has received serious attention 
from practitioners in financial and strategic planning (Willyerd, 1997). This method tracks 
a company's strategy and helps link its financial budgets to its strategic goals (Kaplan & 
Norton, 1996). Could it be adapted for use in educational institutions? 

The balanced scorecard deals with four key areas of performance and provides answers to 
the following questions. 

1. How do we look to our shareholders? 

2. How do customers see us? 

3. What must we excel at? 

4. Can we continue to improve and create value? 

This approach helps ensure that all of the critical performance measures are evaluated in 
addition to return-on-investment issues. It serves the purpose of a check and balance so that 
one area is not overemphasized at the expense of another. Unlike the PLS System, this method 
has enjoyed numerous recent applications in strategic planning (Kaplan & Norton, 2001; 
Kaplan & Norton, 2000; Anonymous, 1999) and in the evaluation of program effectiveness 
(Novak, 2000; Abernathy, 1999). However, it has been criticized by Forbes (2000) for not 
addressing the external and internal indicators that measure the value-added elements of the 
system. Forbes apparently thought the addition of these factors would create a more robust 
system. 

HRD Benefit-Forecasting Model 

The basic HRD benefit-forecasting model uses three components to determine financial 
value: the performance value to result from the HRD program, the cost of the HRD program, 
and the benefit resulting from the HRD program (Swanson & Gradous, 1988). Performance 
value is the dollar value of the performance units resulting from an HRD program; it is 
calculated by multiplying the total number of units expected to result from the program by the 
dollar-value amount of one unit. Benefit is the performance value less the cost of the program. 
Cost is an item of outlay that is incurred in the operation of 
HRD programs; it may include such items as direct and indirect costs, fixed and variable costs, 
and amortization. HRD cost is any expenditure that the organization chooses to attribute to an 
HRD program. 

This approach is valuable in that it contributes useful metrics for concepts like performance 
unit, performance value, and benefit. Although it is a forecasting method rather than a post hoc 
method of evaluation, this does not appear to be a serious limitation. Forecasting benefits can 
yield a realistically derived dollar value expected from an HRD or training program. From the 
numbers on a forecasting worksheet, other measures such as expected ROI and payback time 
can be calculated. The projected numbers can then be compared to the actual numbers. This 
kind of comparison can make forecasting efforts much more accurate and believable. Could 
this be used in education? Possibly, if performance units and performance values can be 
quantified for teachers and students. 
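
A forecasting worksheet of the kind described above can be sketched in a few lines. In this sketch, performance value is taken as units times the dollar value of one unit. The net benefit (performance value less cost), the ROI formula (net benefit as a percentage of cost), and the payback formula (cost divided by monthly performance value) are common definitions adopted here as assumptions, not formulas quoted from Swanson and Gradous. The class name and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Forecast:
    units: float        # performance units expected from the program
    unit_value: float   # dollar value of one performance unit
    cost: float         # total cost attributed to the HRD program
    months: float       # period over which the units are produced

    @property
    def performance_value(self) -> float:
        return self.units * self.unit_value

    @property
    def net_benefit(self) -> float:
        # Assumed definition: performance value less program cost.
        return self.performance_value - self.cost

    @property
    def roi_percent(self) -> float:
        # Assumed definition: net benefit as a percentage of cost.
        return 100.0 * self.net_benefit / self.cost

    @property
    def payback_months(self) -> float:
        # Months of performance value needed to recover the cost.
        return self.cost / (self.performance_value / self.months)


projected = Forecast(units=400, unit_value=150.0, cost=20_000.0, months=12)
actual = Forecast(units=360, unit_value=150.0, cost=22_000.0, months=12)

for label, f in (("Projected", projected), ("Actual", actual)):
    print(f"{label}: value ${f.performance_value:,.0f}, "
          f"net benefit ${f.net_benefit:,.0f}, "
          f"ROI {f.roi_percent:.0f}%, payback {f.payback_months:.1f} months")
```

Running the same calculation on projected and actual figures gives the side-by-side comparison that is said to make later forecasts more accurate.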


The Relative-Aggregate-Scores Approach 

This approach, like the time-value-of-money method previously discussed, is borrowed 
from production operations management (POM). Gaither (1996) used it to help compare 
alternative locations for a production facility. It can be used to compare the relative value of 
training for different tasks or duties within a job, and/or for assessing the gain in value for that 
job after training. When the relative aggregate scores approach is used to assess the benefits of 
training, characteristics such as the difficulty and the importance of various job tasks are 
assessed, usually on a 1-4 scale. For each job task, a product is computed by multiplying the 
difficulty rating by the importance rating by the frequency figure, which is expressed as the 
percentage of time spent on the task. These products are converted to a percentage of the total, 
and the percentages become the weights for each job task. Then, actual data are used to show a 
performance value for each task. The performance value for each task is calculated by 
multiplying each weight by the total salary for the job. Ratings by supervisors before and 
after training enable the computation of pre- and post-training performance values for each 
task, as well as a performance gain for each task and for the job as a whole (Brauchle, 1992). 
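
The weighting steps above can be sketched as follows. The task names, ratings, time shares, and salary are hypothetical. The text does not specify how supervisor ratings are converted to dollars, so this sketch assumes that a rating is applied as a fraction of the top of the rating scale; that step is an assumption, not the published procedure.

```python
# task: (difficulty 1-4, importance 1-4, share of time, pre rating, post rating)
tasks = {
    "Set up equipment":   (2, 3, 0.20, 2, 3),
    "Diagnose faults":    (4, 4, 0.30, 1, 3),
    "Complete paperwork": (1, 2, 0.50, 3, 3),
}
SALARY = 40_000.0   # hypothetical total annual salary for the job
MAX_RATING = 4      # assumed top of the supervisor rating scale

# Weight = difficulty x importance x frequency, as a share of the total.
products = {t: d * i * f for t, (d, i, f, _, _) in tasks.items()}
total = sum(products.values())
weights = {t: p / total for t, p in products.items()}

# Dollar value of each task = weight x total salary for the job.
task_values = {t: w * SALARY for t, w in weights.items()}

# Assumed conversion: rating / MAX_RATING of the task value is being realized.
gains = {}
for task, (_, _, _, pre, post) in tasks.items():
    pre_value = task_values[task] * pre / MAX_RATING
    post_value = task_values[task] * post / MAX_RATING
    gains[task] = post_value - pre_value
    print(f"{task:<20} weight {weights[task]:.1%}  "
          f"pre ${pre_value:,.0f}  post ${post_value:,.0f}  gain ${gains[task]:,.0f}")

total_gain = sum(gains.values())
print(f"Performance gain for the job: ${total_gain:,.0f}")
```

Sorting the tasks by weight gives the Pareto-style view of the job, showing which tasks carry most of its value.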

The results of this approach can be used in several ways. The weights for each job task 
can help in creating a kind of Pareto analysis of the job so it can be seen which tasks are the 
most significant. They can also indicate where it may be desirable to place training or HRD 
dollars in order to obtain the best results. The relative aggregate scores approach necessitates 
weighing and considering what various portions of the job are actually worth, and provides 
actual figures for a gain in performance value. It can help in monitoring the results of training 
and in planning training so that limited resources are invested in the most important areas. 
Technical educators could use it to assess the value of mastery of various tasks in a program. 
Administrators could use it to assess the value of teacher improvement in various categories 
from in-service training programs. 

Unemployment Insurance (UI) Wage Studies 

Most of the preceding methods for evaluating HRD programs are good for microanalysis, 
focusing on just one individual, program, or job; but macro-analysis methods, those that cast a 
much wider net, are also available. One of these is known as the UI Wage Study (Kornfeld & 
Bloom, 1999). In a UI Wage Study, blocks of independently gathered data are merged and 
subjected to post hoc analysis to ascertain whether individuals' post-training salaries 
showed a gain over their pre-training salaries. The use of salaries as a measure of program 
effectiveness is appealing for several reasons. 

1. Salaries use dollars as the unit of measure for the analysis. 

2. The dollar is a ratio measure. For example, $100 is twice as much as $50. An interval 
measure, like that used in a questionnaire, does not permit such statements. Because a 
thermometer uses an interval scale, it cannot be said that a temperature of 100 
degrees is twice as hot as 50 degrees. The ratio scale enables the expression of 
differences with more precision, therefore making more meaningful statements possible. 

3. The use of dollars as a measurement is usually acceptable to managers because they 
understand that metric and frequently use it to compare options. 
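
The distinction between ratio and interval measures in the second point can be checked numerically. The sketch below assumes the temperatures are in degrees Celsius, which the text does not specify, and uses hypothetical helper names. A ratio of dollar amounts survives a change of unit, while a ratio of temperatures does not, which is why "twice as hot" is not a meaningful statement on an interval scale.

```python
def dollars_to_cents(dollars: float) -> float:
    return dollars * 100


def celsius_to_fahrenheit(celsius: float) -> float:
    return celsius * 9 / 5 + 32


# Ratio scale: the ratio is the same in dollars and in cents.
dollar_ratio = 100 / 50
cents_ratio = dollars_to_cents(100) / dollars_to_cents(50)

# Interval scale: the apparent ratio depends on the arbitrary zero point.
celsius_ratio = 100 / 50
fahrenheit_ratio = celsius_to_fahrenheit(100) / celsius_to_fahrenheit(50)

print(f"Dollars: {dollar_ratio:.2f}   Cents: {cents_ratio:.2f}")
print(f"Celsius: {celsius_ratio:.2f}   Fahrenheit: {fahrenheit_ratio:.2f}")
```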

The UI Wage Study is based on the notion that most organizations have some kind of 
computerized record system for tracking their people. For example, every educational 
institution has some sort of number assigned to each student. Linked to that number are the 
data on what courses were taken and when, as well as other information. Other sources of data 
are the Unemployment Insurance Wage Reports that are posted quarterly in each state. If the 
Social Security number of an individual is known and can be matched to educational records, 
the gains in salary for individuals after training can be computed by comparing educational 
records with Unemployment Insurance Wage Reports. Salary gains can be calculated for 
individuals, by program, or for completers and non-completers. 
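
The record-matching step can be sketched as a simple join on a shared identifier. Everything in this sketch is hypothetical: the identifiers stand in for Social Security numbers, the record layouts are invented for illustration, and real wage reports are quarterly rather than the annualized figures used here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical educational records keyed by an identifier standing in for the SSN.
education = {
    "A01": {"program": "Welding", "completed": True},
    "A02": {"program": "Welding", "completed": False},
    "A03": {"program": "Nursing", "completed": True},
    "A04": {"program": "Nursing", "completed": True},   # no matching wage record
}

# Hypothetical annualized earnings from wage reports: (pre-training, post-training).
wages = {
    "A01": (18_000.0, 26_000.0),
    "A02": (17_000.0, 19_000.0),
    "A03": (20_000.0, 34_000.0),
}


def salary_gains(education, wages):
    """Match records on the shared identifier and return per-person gains."""
    gains = {}
    for person_id, record in education.items():
        if person_id in wages:                 # keep only matched individuals
            pre, post = wages[person_id]
            gains[person_id] = {**record, "gain": post - pre}
    return gains


def mean_gain_by(gains, key):
    """Average the salary gain within each value of the given record field."""
    groups = defaultdict(list)
    for row in gains.values():
        groups[row[key]].append(row["gain"])
    return {group: mean(values) for group, values in groups.items()}


gains = salary_gains(education, wages)
match_rate = len(gains) / len(education)
by_program = mean_gain_by(gains, "program")
by_completion = mean_gain_by(gains, "completed")

print(f"Matched {match_rate:.0%} of educational records")
print("Mean gain by program:", by_program)
print("Mean gain by completion status:", by_completion)
```

The match rate reported by the sketch corresponds to the share of students who can be located in the wage data, and the grouped means correspond to gains by program and for completers versus non-completers.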

This kind of study has several advantages over other commonly used methods of 
evaluating training programs. First, it is less expensive than costly and cumbersome surveys. 
Second, it is as good as or better than surveys at obtaining high-quality data (Kornfeld 
& Bloom, 1999). Third, it appears to represent a population much more accurately 
than traditional survey-based approaches. Sanchez, Laanan, and Wisely (1999) found that 
from 71% to 84% of occupational program completers could be found in UI Wage databases. 
This makes them very useful for analyzing the effects of occupational programs and should be 
beneficial for other programs as well. For example, the City Colleges of Chicago studied 
student earnings in a 1997 cohort of students from Truman College (Brauchle & Hastings, 
2003). The district was able to track changes in earnings for students who had taken individual 
courses, received certificates, or obtained associate degrees. They were also able to track 
earnings by gender, by age, for students from an economically disadvantaged background, and 
by vocational major. It appears that this method of establishing the monetary benefits of 
training or HRD programs has enormous potential to provide compelling evidence for learner 
and program achievement. 


A Perspective on Evaluation Methods 

Many HRD practitioners believe that the ultimate level of evaluation, return on 
investment, shows the true contribution of training (Phillips, 1996). The process is not 
complete "...until the results have been converted to monetary values and compared with the 
cost of the program" (p. 20). Four distinct and important benefits can be derived from the 
implementation of effective evaluation measures in an organization. First of all, ROI will 
measure the contribution the program made to the organization and will determine if it was a 
good investment. Second, this calculation will determine which programs contribute the most 
to the organization and allow priorities to be established for high-impact training programs. 
Third, the evaluation brings a focus upon the results of all programs, not just those targeted for 
the financial evaluation. Fourth, this process can help convince management that training or 
education is an investment and not just an expense (Phillips, 1997b). Another benefit, this one 
not mentioned by Phillips, is that most of these methods express the results of training 
programs in terms of dollars, a metric that is of common interest to managers and decision 
makers. 

Conclusion 

"One of the most common criticisms of trainers is that they do not measure the 
organization's return on investment in training" (Mendosa, 1995, p. 66). "The statistics 
everyone wants, those that would tell us the return on training dollars spent, has proven to be 
stubbornly elusive" (Fagiano, 1995, p. 12). A number of methods have been described in this 
manuscript that help organizations evaluate the benefits of training, including several models 
that have been used, or are currently being used, to evaluate return on investment. These 
methods are generally thought to fit within the "Results" level of Kirkpatrick's four-level 
evaluation model. Most HRD, training, and educational professionals agree with Parry (1996) 
that "...training doesn't cost ...it pays, and HRD is an investment, not an expense" (p. 72). 
These approaches are being used more and more by companies to show that there is a high 
return on the investment made in training and development (Purcell, 2000). Education might 
be well served to take note of this and to become better at using these techniques. As Table 2 
indicates, a wide variety of methods exist to show the value of education and training; and 
each has its advantages and disadvantages. A careful mixture of these approaches is likely to 
be useful for most evaluation problems. 

Table 2 

Summary of Assessment Methods 

Methodology                  Data       Results                          Applied           Level  Value

Benefit/cost ratio           Hard       Gives ratio of cost to benefits  Before or after   4      High

Payback period               Hard       Provides time to payback of      Before or after   4      High
                                        initial investment

Return on true value of      Hard       Gives % return on initial        Before and after  4      High
dollars                                 investment

Present value of dollars     Hard       Accounts for value of future     Before            4      High
and future value of dollars             savings in present dollars

Utility analysis             Hard       Net value of HRD                 After             4      Med

360-degree feedback          Soft       Provides estimates of            After             3/4    Low
                                        performance improvement from
                                        various sources

Performance team             Semi-soft  Estimates knowledge and          After             3/4    Low
satisfaction                            expertise developed from HRD
                                        program

Balanced scorecard           Soft       Relates various performance      Before and after  3/4    Low
                                        measures to strategic issues

HRD benefit forecasting      Hard       Estimates relative value of HRD  Before            3/4    High
                                        program approaches

Relative aggregate scores    Hard       Computes relative values of job  After             4      High
                                        and task value

UI Wage Studies              Hard       Provides actual earnings one,    After             4      High
                                        two, and three years out

As early as 1985, Laird stated that "client-managers need measurement as an 
indication that they're solving or eliminating performance problems... getting something back 
on the training investment" (p. 241). In today's economy, with downsizing and worldwide 
competition among public and private organizations, it is very important to be able to 
justify training expenses, as well as all other expenses of these organizations. All of these 
expenses must relate to organizational performance growth, profit, market share, etc.; and the 
dollars must contribute to this bottom-line measurement. 